Text-directed speech enhancement employing phone class parsing and feature map constrained vector quantization
Authors
Abstract
There are many situations where non-real-time speech enhancement is required. For such applications, employing any available a priori knowledge can lead to more effective enhancement solutions. In this study, a novel text-directed speech enhancement algorithm is developed for use in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech into regions of broad phoneme classes. The classes considered are stops, fricatives, affricates, nasals, vowels, semivowels, diphthongs and silence. These partitions are then used to direct a new vector-quantizer-based enhancement scheme in which phone-class-directed constraints are applied to improve speech quality. The proposed algorithm is evaluated using both objective and subjective quality assessment techniques. It is shown that the text-directed approach improves the quality of the degraded speech over a broad range of noise sources (i.e., flat communications channel noise, aircraft cockpit noise, helicopter fly-by noise, and automobile highway noise) and over a broad range of signal-to-noise ratios (i.e., 10, 5, 0 and -5 dB). In each case, the proposed method consistently exhibits better objective quality than linear and generalized spectral subtraction, as well as the Auto-LSP constrained iterative enhancement method, as measured with the Itakura-Saito measure on a 100-sentence evaluation speech corpus. Subjective quality assessment was conducted in the form of an A-B comparison test. Results of these evaluations demonstrate that, for wideband noise distortions, the proposed algorithm is preferred over the unprocessed noisy speech by more than 2 to 1 and over spectral subtraction by more than 3 to 1.
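To make the objective evaluation mentioned in the abstract more concrete, the following is a minimal sketch of an Itakura-Saito style comparison between power spectra. It assumes the discrete spectral form of the divergence, averaged over frequency bins; the abstract does not say which variant the authors used (an LPC-based form is also common), so this should not be read as their implementation. The function name `itakura_saito`, the `floor` parameter, and the toy spectra are all hypothetical and chosen only for illustration.

```python
import math


def itakura_saito(ref_power, test_power, floor=1e-12):
    """Mean Itakura-Saito divergence between two power spectra.

    Assumed form (illustrative only): for each bin, with r = P_ref / P_test,
    accumulate r - ln(r) - 1, then average over bins. The result is 0 when
    the spectra are identical and grows as they diverge.
    """
    if len(ref_power) != len(test_power):
        raise ValueError("spectra must have the same number of bins")
    total = 0.0
    for p, q in zip(ref_power, test_power):
        # Floor both values to avoid division by zero and log(0).
        ratio = max(p, floor) / max(q, floor)
        total += ratio - math.log(ratio) - 1.0
    return total / len(ref_power)


# Toy power spectra (hypothetical values, not data from the paper).
clean = [1.0, 4.0, 2.0, 0.5]
noisy = [p + 1.0 for p in clean]     # clean speech plus a flat noise floor
enhanced = [p + 0.1 for p in clean]  # output of some hypothetical enhancer

d_noisy = itakura_saito(clean, noisy)
d_enhanced = itakura_saito(clean, enhanced)
```

In this toy setup a lower value means the test spectrum is closer to the clean reference, so an enhancer that removes most of the noise floor should yield `d_enhanced < d_noisy`. In practice the measure would be computed frame by frame and averaged over an evaluation corpus.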
Similar resources
Text-directed speech enhancement using phoneme classification and feature map constrained vector quantization
This paper presents and evaluates a novel text-directed speech enhancement algorithm for usage in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech into regions of broad phoneme classifications. Classes considered include stops, fricatives, affricates, nasals, vowels, semivowels, diphthongs and silence. These partitions are then used ...
Text-Directed Speech Enhancement Employing
There are many situations where non-real-time speech enhancement is required. For such applications, employing any available a priori knowledge can lead to more effective enhancement solutions. In this study, a novel text-directed speech enhancement algorithm is developed for usage in non-real-time applications. In our approach, the text of the intended dialogue is used to partition noisy speech...
Performance Analysis of Speech Enhancement Algorithm for Robust Speech Recognition System
Widely Speech Signal Processing has not been used much in the field of electronics and computers due to the complexity and variety of speech signals and sounds with the advent of new technology. However, with modern processes, algorithms, and methods which can proc Demand for speech recognition technology is expected to their mobile phones as all purpose lifestyle devices. In this paper, an imp...
Feature extraction in opinion mining through Persian reviews
Opinion mining deals with an analysis of user reviews for extracting their opinions, sentiments and demands in a specific area, which can play an important role in making major decisions in such area. In general, opinion mining extracts user reviews at three levels of document, sentence and feature. Opinion mining at the feature level is taken into consideration more than the other two levels d...
An axiomatic approach to soft learning vector quantization and clustering
This paper presents an axiomatic approach to soft learning vector quantization (LVQ) and clustering based on reformulation. The reformulation of the fuzzy c-means (FCM) algorithm provides the basis for reformulating entropy-constrained fuzzy clustering (ECFC) algorithms. This analysis indicates that minimization of admissible reformulation functions using gradient descent leads to a broad varie...
Journal title:
- Speech Communication
Volume 21, Issue
Pages -
Publication date 1997